Overview

Dataset statistics

Number of variables: 24
Number of observations: 64404
Missing cells: 707067
Missing cells (%): 45.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 46.8 MiB
Average record size in memory: 762.7 B
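The missing-cell percentage follows from the counts above: missing cells divided by (variables × observations). A minimal sketch that reproduces the reported figure; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 24
n_observations = 64404
missing_cells = 707067

# Every (row, column) position counts as one cell.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```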

Variable types

Categorical: 11
Numeric: 12
Unsupported: 1

Alerts

MarcaVehiculo__c has constant value "97.0" [Constant]
MdeloVehiculo__c has constant value "999.0" [Constant]
n_prod_prev is highly correlated with total_siniestros and 1 other field [High correlation]
total_siniestros is highly correlated with n_prod_prev and 2 other fields [High correlation]
total_pagado_smmlv is highly correlated with n_prod_prev and 2 other fields [High correlation]
anios_ultimo_siniestro is highly correlated with total_siniestros and 1 other field [High correlation]
Activos__c is highly correlated with AnnualRevenue and 1 other field [High correlation]
AnnualRevenue is highly correlated with Activos__c and 1 other field [High correlation]
EgresosAnuales__c is highly correlated with Activos__c and 1 other field [High correlation]
total_siniestros is highly correlated with total_pagado_smmlv [High correlation]
total_pagado_smmlv is highly correlated with total_siniestros [High correlation]
anios_ultimo_siniestro is highly correlated with AnnualRevenue and 1 other field [High correlation]
AnnualRevenue is highly correlated with anios_ultimo_siniestro and 1 other field [High correlation]
EgresosAnuales__c is highly correlated with anios_ultimo_siniestro and 1 other field [High correlation]
total_siniestros is highly correlated with total_pagado_smmlv and 1 other field [High correlation]
total_pagado_smmlv is highly correlated with total_siniestros and 1 other field [High correlation]
anios_ultimo_siniestro is highly correlated with total_siniestros and 1 other field [High correlation]
AnnualRevenue is highly correlated with EgresosAnuales__c [High correlation]
EgresosAnuales__c is highly correlated with AnnualRevenue [High correlation]
churn is highly correlated with n_prod_prev and 2 other fields [High correlation]
FechaInicioVigencia__ctrim is highly correlated with MarcaVehiculo__c and 1 other field [High correlation]
n_prod_prev is highly correlated with churn and 3 other fields [High correlation]
EstadoCivil__pc is highly correlated with MarcaVehiculo__c and 3 other fields [High correlation]
MarcaVehiculo__c is highly correlated with churn and 9 other fields [High correlation]
CodigoTipoAsegurado__c is highly correlated with MarcaVehiculo__c and 1 other field [High correlation]
tipo_poliza_name is highly correlated with MarcaVehiculo__c and 3 other fields [High correlation]
TipoVehiculo__c is highly correlated with n_prod_prev and 5 other fields [High correlation]
Genero__pc is highly correlated with EstadoCivil__pc and 3 other fields [High correlation]
MdeloVehiculo__c is highly correlated with churn and 9 other fields [High correlation]
tipo_prod_desc is highly correlated with MarcaVehiculo__c and 2 other fields [High correlation]
CodigoTipoAsegurado__c is highly correlated with n_prod_prev and 1 other field [High correlation]
PuntoVenta__c is highly correlated with ClaseVehiculo__c and 1 other field [High correlation]
tipo_poliza_name is highly correlated with tipo_prod_desc and 9 other fields [High correlation]
tipo_prod_desc is highly correlated with tipo_poliza_name and 2 other fields [High correlation]
ClaseVehiculo__c is highly correlated with PuntoVenta__c and 9 other fields [High correlation]
TipoVehiculo__c is highly correlated with PuntoVenta__c and 9 other fields [High correlation]
NumeroPoliza__c is highly correlated with tipo_poliza_name and 6 other fields [High correlation]
FechaInicioVigencia__ctrim is highly correlated with tipo_poliza_name and 3 other fields [High correlation]
churn is highly correlated with ClaseVehiculo__c and 3 other fields [High correlation]
n_prod_prev is highly correlated with CodigoTipoAsegurado__c and 10 other fields [High correlation]
total_siniestros is highly correlated with tipo_poliza_name and 3 other fields [High correlation]
total_pagado_smmlv is highly correlated with CodigoTipoAsegurado__c and 7 other fields [High correlation]
anios_ultimo_siniestro is highly correlated with Activos__c and 2 other fields [High correlation]
Activos__c is highly correlated with n_prod_prev and 4 other fields [High correlation]
AnnualRevenue is highly correlated with n_prod_prev and 3 other fields [High correlation]
MontoAnual__c is highly correlated with tipo_poliza_name and 4 other fields [High correlation]
OtrosIngresos__c is highly correlated with Activos__c and 1 other field [High correlation]
EgresosAnuales__c is highly correlated with n_prod_prev and 4 other fields [High correlation]
EstadoCivil__pc is highly correlated with ClaseVehiculo__c and 3 other fields [High correlation]
Genero__pc is highly correlated with tipo_poliza_name and 2 other fields [High correlation]
edad is highly correlated with ClaseVehiculo__c and 1 other field [High correlation]
MarcaVehiculo__c has 12899 (20.0%) missing values [Missing]
MdeloVehiculo__c has 12899 (20.0%) missing values [Missing]
n_prod_prev has 61750 (95.9%) missing values [Missing]
total_siniestros has 60246 (93.5%) missing values [Missing]
total_pagado_smmlv has 60246 (93.5%) missing values [Missing]
anios_ultimo_siniestro has 60246 (93.5%) missing values [Missing]
Activos__c has 58200 (90.4%) missing values [Missing]
AnnualRevenue has 58200 (90.4%) missing values [Missing]
MontoAnual__c has 64395 (> 99.9%) missing values [Missing]
OtrosIngresos__c has 59709 (92.7%) missing values [Missing]
Profesion__pc has 64404 (100.0%) missing values [Missing]
EgresosAnuales__c has 58200 (90.4%) missing values [Missing]
EstadoCivil__pc has 9823 (15.3%) missing values [Missing]
Genero__pc has 9823 (15.3%) missing values [Missing]
edad has 56027 (87.0%) missing values [Missing]
OtrosIngresos__c is highly skewed (γ1 = 41.12936967) [Skewed]
Profesion__pc is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
OtrosIngresos__c has 4402 (6.8%) zeros [Zeros]
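One way to act on the missing-value alerts is to screen columns by their missing ratio. The sketch below uses only the counts reported in the alerts; the 90% cut-off (`THRESHOLD`) and the helper name `mostly_missing` are hypothetical choices for illustration, not something the report prescribes.

```python
# Missing-value counts copied from the alerts; 64404 observations in total.
N_OBSERVATIONS = 64404
MISSING_COUNTS = {
    "MarcaVehiculo__c": 12899,
    "MdeloVehiculo__c": 12899,
    "n_prod_prev": 61750,
    "total_siniestros": 60246,
    "total_pagado_smmlv": 60246,
    "anios_ultimo_siniestro": 60246,
    "Activos__c": 58200,
    "AnnualRevenue": 58200,
    "MontoAnual__c": 64395,
    "OtrosIngresos__c": 59709,
    "Profesion__pc": 64404,
    "EgresosAnuales__c": 58200,
    "EstadoCivil__pc": 9823,
    "Genero__pc": 9823,
    "edad": 56027,
}
THRESHOLD = 0.90  # hypothetical cut-off, chosen only for this example


def mostly_missing(counts, n_rows, threshold):
    """Return the sorted names of columns whose missing ratio is >= threshold."""
    return sorted(
        name for name, missing in counts.items() if missing / n_rows >= threshold
    )


flagged = mostly_missing(MISSING_COUNTS, N_OBSERVATIONS, THRESHOLD)
for name in flagged:
    print(f"{name}: {100 * MISSING_COUNTS[name] / N_OBSERVATIONS:.1f}% missing")
```

Whether a flagged column should be dropped, imputed, or turned into a "has value" indicator depends on why the values are missing, which the report does not say.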

Reproduction

Analysis started: 2022-05-07 20:48:01.468763
Analysis finished: 2022-05-07 20:54:22.011683
Duration: 6 minutes and 20.54 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
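The reported duration can be checked from the two timestamps. A small sketch using only the standard library; the format string `FMT` is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-05-07 20:48:01.468763", FMT)
finished = datetime.strptime("2022-05-07 20:54:22.011683", FMT)

elapsed = finished - started
# Split the elapsed seconds into whole minutes and remaining seconds.
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```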

Variables

CodigoTipoAsegurado__c
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Top categories (count): 1 (61551), 4 (1256), 2 (838), 3 (759)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 64404
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 4

Common Values

Value | Count | Frequency (%)
1 | 61551 | 95.6%
4 | 1256 | 2.0%
2 | 838 | 1.3%
3 | 759 | 1.2%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
1 | 61551 | 95.6%
4 | 1256 | 2.0%
2 | 838 | 1.3%
3 | 759 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 64404 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 61551 | 95.6%
4 | 1256 | 2.0%
2 | 838 | 1.3%
3 | 759 | 1.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 64404 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 61551 | 95.6%
4 | 1256 | 2.0%
2 | 838 | 1.3%
3 | 759 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 64404 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 61551 | 95.6%
4 | 1256 | 2.0%
2 | 838 | 1.3%
3 | 759 | 1.2%

PuntoVenta__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1394
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7529.805261
Minimum: 1
Maximum: 99999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 301
Q1: 1778
median: 9672
Q3: 12285
95-th percentile: 12845
Maximum: 99999
Range: 99998
Interquartile range (IQR): 10507

Descriptive statistics

Standard deviation: 5009.310331
Coefficient of variation (CV): 0.6652642609
Kurtosis: 0.1363023156
Mean: 7529.805261
Median Absolute Deviation (MAD): 2946
Skewness: -0.2235605019
Sum: 484949578
Variance: 25093190
Monotonicity: Not monotonic
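Several of the PuntoVenta__c statistics are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A minimal sketch that recomputes them from the reported figures (variable names are illustrative):

```python
# Figures copied from the PuntoVenta__c statistics.
minimum, q1, q3, maximum = 1, 1778, 12285, 99999
mean, std = 7529.805261, 5009.310331

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(f"range={value_range} iqr={iqr} cv={cv:.6f} variance={variance:.0f}")
```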
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
3301 | 1863 | 2.9%
12190 | 1647 | 2.6%
7002 | 1362 | 2.1%
1149 | 1065 | 1.7%
19 | 979 | 1.5%
9721 | 977 | 1.5%
610 | 963 | 1.5%
12254 | 836 | 1.3%
1503 | 740 | 1.1%
103 | 736 | 1.1%
Other values (1384) | 53236 | 82.7%

Smallest values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3 | 1 | < 0.1%
5 | 188 | 0.3%
7 | 1 | < 0.1%
8 | 11 | < 0.1%
9 | 5 | < 0.1%
11 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 38 | 0.1%
15 | 8 | < 0.1%

Largest values

Value | Count | Frequency (%)
99999 | 1 | < 0.1%
20001 | 11 | < 0.1%
13093 | 4 | < 0.1%
13088 | 8 | < 0.1%
13083 | 1 | < 0.1%
13080 | 3 | < 0.1%
13076 | 3 | < 0.1%
13074 | 9 | < 0.1%
13072 | 17 | < 0.1%
13071 | 3 | < 0.1%

tipo_poliza_name
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 MiB
Top categories (count): s.o.a.t (51505), individual (4596), responsabilidad civil (2602), otras (1564), de daños tradicional (1136), Other values (9) (3001)

Length

Max length: 45
Median length: 7
Mean length: 8.543397926
Min length: 5

Characters and Unicode

Total characters: 550229
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: de daños tradicional
2nd row: de daños
3rd row: flotante
4th row: responsabilidad civil
5th row: colectiva

Common Values

Value | Count | Frequency (%)
s.o.a.t | 51505 | 80.0%
individual | 4596 | 7.1%
responsabilidad civil | 2602 | 4.0%
otras | 1564 | 2.4%
de daños tradicional | 1136 | 1.8%
de deudores hipotecarios | 743 | 1.2%
de daños | 470 | 0.7%
flotante | 456 | 0.7%
todo riesgo de obras civiles daños materiales | 416 | 0.6%
global sector privado | 412 | 0.6%
Other values (4) | 504 | 0.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
s.o.a.t | 51505 | 68.7%
individual | 4596 | 6.1%
de | 2856 | 3.8%
responsabilidad | 2602 | 3.5%
civil | 2602 | 3.5%
daños | 2022 | 2.7%
otras | 1564 | 2.1%
tradicional | 1136 | 1.5%
deudores | 743 | 1.0%
hipotecarios | 743 | 1.0%
Other values (18) | 4568 | 6.1%

Most occurring characters

Value | Count | Frequency (%)
. | 154515 | 28.1%
a | 71273 | 13.0%
o | 64942 | 11.8%
s | 64154 | 11.7%
t | 57463 | 10.4%
i | 30394 | 5.5%
d | 22882 | 4.2%
l | 13576 | 2.5%
e | 10717 | 1.9%
(space) | 10533 | 1.9%
Other values (12) | 49780 | 9.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 385181 | 70.0%
Other Punctuation | 154515 | 28.1%
Space Separator | 10533 | 1.9%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 71273 | 18.5%
o | 64942 | 16.9%
s | 64154 | 16.7%
t | 57463 | 14.9%
i | 30394 | 7.9%
d | 22882 | 5.9%
l | 13576 | 3.5%
e | 10717 | 2.8%
n | 9203 | 2.4%
r | 9182 | 2.4%
Other values (10) | 31395 | 8.2%

Other Punctuation
Value | Count | Frequency (%)
. | 154515 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 10533 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 385181 | 70.0%
Common | 165048 | 30.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 71273 | 18.5%
o | 64942 | 16.9%
s | 64154 | 16.7%
t | 57463 | 14.9%
i | 30394 | 7.9%
d | 22882 | 5.9%
l | 13576 | 3.5%
e | 10717 | 2.8%
n | 9203 | 2.4%
r | 9182 | 2.4%
Other values (10) | 31395 | 8.2%

Common
Value | Count | Frequency (%)
. | 154515 | 93.6%
(space) | 10533 | 6.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 548207 | 99.6%
None | 2022 | 0.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 154515 | 28.2%
a | 71273 | 13.0%
o | 64942 | 11.8%
s | 64154 | 11.7%
t | 57463 | 10.5%
i | 30394 | 5.5%
d | 22882 | 4.2%
l | 13576 | 2.5%
e | 10717 | 2.0%
(space) | 10533 | 1.9%
Other values (11) | 47758 | 8.7%

None
Value | Count | Frequency (%)
ñ | 2022 | 100.0%

tipo_prod_desc
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB
Top categories (count): otras (62198), convenios (1230), au excepciones (482), au ded unic liv (470), disp legales (24)

Length

Max length: 15
Median length: 5
Mean length: 5.219334203
Min length: 5

Characters and Unicode

Total characters: 336146
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: otras
2nd row: otras
3rd row: otras
4th row: otras
5th row: otras

Common Values

Value | Count | Frequency (%)
otras | 62198 | 96.6%
convenios | 1230 | 1.9%
au excepciones | 482 | 0.7%
au ded unic liv | 470 | 0.7%
disp legales | 24 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
otras | 62198 | 93.8%
convenios | 1230 | 1.9%
au | 952 | 1.4%
excepciones | 482 | 0.7%
ded | 470 | 0.7%
unic | 470 | 0.7%
liv | 470 | 0.7%
disp | 24 | < 0.1%
legales | 24 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
o | 65140 | 19.4%
s | 63958 | 19.0%
a | 63174 | 18.8%
r | 62198 | 18.5%
t | 62198 | 18.5%
n | 3412 | 1.0%
e | 3194 | 1.0%
i | 2676 | 0.8%
c | 2664 | 0.8%
(space) | 1916 | 0.6%
Other values (7) | 5616 | 1.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 334230 | 99.4%
Space Separator | 1916 | 0.6%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 65140 | 19.5%
s | 63958 | 19.1%
a | 63174 | 18.9%
r | 62198 | 18.6%
t | 62198 | 18.6%
n | 3412 | 1.0%
e | 3194 | 1.0%
i | 2676 | 0.8%
c | 2664 | 0.8%
v | 1700 | 0.5%
Other values (6) | 3916 | 1.2%

Space Separator
Value | Count | Frequency (%)
(space) | 1916 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 334230 | 99.4%
Common | 1916 | 0.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 65140 | 19.5%
s | 63958 | 19.1%
a | 63174 | 18.9%
r | 62198 | 18.6%
t | 62198 | 18.6%
n | 3412 | 1.0%
e | 3194 | 1.0%
i | 2676 | 0.8%
c | 2664 | 0.8%
v | 1700 | 0.5%
Other values (6) | 3916 | 1.2%

Common
Value | Count | Frequency (%)
(space) | 1916 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 336146 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 65140 | 19.4%
s | 63958 | 19.0%
a | 63174 | 18.8%
r | 62198 | 18.5%
t | 62198 | 18.5%
n | 3412 | 1.0%
e | 3194 | 1.0%
i | 2676 | 0.8%
c | 2664 | 0.8%
(space) | 1916 | 0.6%
Other values (7) | 5616 | 1.7%

ClaseVehiculo__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20029.1162
Minimum: 1
Maximum: 99999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 5
95-th percentile: 99999
Maximum: 99999
Range: 99998
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 40020.56009
Coefficient of variation (CV): 1.998119123
Kurtosis: 0.2434989554
Mean: 20029.1162
Median Absolute Deviation (MAD): 0
Skewness: 1.497828892
Sum: 1289955200
Variance: 1601645230
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Most frequent values

Value | Count | Frequency (%)
1 | 46371 | 72.0%
99999 | 12899 | 20.0%
5 | 2819 | 4.4%
2 | 1275 | 2.0%
3 | 474 | 0.7%
7 | 209 | 0.3%
6 | 191 | 0.3%
4 | 84 | 0.1%
9 | 60 | 0.1%
8 | 22 | < 0.1%

Smallest values

Value | Count | Frequency (%)
1 | 46371 | 72.0%
2 | 1275 | 2.0%
3 | 474 | 0.7%
4 | 84 | 0.1%
5 | 2819 | 4.4%
6 | 191 | 0.3%
7 | 209 | 0.3%
8 | 22 | < 0.1%
9 | 60 | 0.1%
99999 | 12899 | 20.0%

Largest values

Value | Count | Frequency (%)
99999 | 12899 | 20.0%
9 | 60 | 0.1%
8 | 22 | < 0.1%
7 | 209 | 0.3%
6 | 191 | 0.3%
5 | 2819 | 4.4%
4 | 84 | 0.1%
3 | 474 | 0.7%
2 | 1275 | 2.0%
1 | 46371 | 72.0%
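The value 99999 occurs 12899 times in ClaseVehiculo__c, the same count as the missing values in MarcaVehiculo__c and MdeloVehiculo__c, and it dominates the mean. The sketch below assumes, without confirmation from the report, that 99999 is a placeholder code rather than a real vehicle class, and shows how much the mean changes when it is excluded. `SENTINEL` and `weighted_mean` are illustrative names.

```python
# Value counts copied from the ClaseVehiculo__c frequency table.
VALUE_COUNTS = {
    1: 46371, 99999: 12899, 5: 2819, 2: 1275, 3: 474,
    7: 209, 6: 191, 4: 84, 9: 60, 8: 22,
}
SENTINEL = 99999  # assumption: treated here as a "not applicable" code


def weighted_mean(counts):
    """Mean of a column described by a {value: count} table."""
    n = sum(counts.values())
    return sum(value * count for value, count in counts.items()) / n


mean_all = weighted_mean(VALUE_COUNTS)
mean_without_sentinel = weighted_mean(
    {v: c for v, c in VALUE_COUNTS.items() if v != SENTINEL}
)

print(f"mean with 99999 included: {mean_all:.4f}")
print(f"mean with 99999 excluded: {mean_without_sentinel:.2f}")
```

The first figure matches the reported mean of 20029.1162. Since the values look like class codes, treating the column as categorical may be more appropriate than either mean.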

MarcaVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 12899
Missing (%): 20.0%
Memory size: 3.4 MiB
Top categories (count): 97.0 (51505)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 206020
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 97.0
2nd row: 97.0
3rd row: 97.0
4th row: 97.0
5th row: 97.0

Common Values

Value | Count | Frequency (%)
97.0 | 51505 | 80.0%
(Missing) | 12899 | 20.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
97.0 | 51505 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
9 | 51505 | 25.0%
7 | 51505 | 25.0%
. | 51505 | 25.0%
0 | 51505 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 154515 | 75.0%
Other Punctuation | 51505 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
9 | 51505 | 33.3%
7 | 51505 | 33.3%
0 | 51505 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
. | 51505 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 206020 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
9 | 51505 | 25.0%
7 | 51505 | 25.0%
. | 51505 | 25.0%
0 | 51505 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 206020 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
9 | 51505 | 25.0%
7 | 51505 | 25.0%
. | 51505 | 25.0%
0 | 51505 | 25.0%

MdeloVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 12899
Missing (%): 20.0%
Memory size: 3.5 MiB
Top categories (count): 999.0 (51505)

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 257525
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 999.0
2nd row: 999.0
3rd row: 999.0
4th row: 999.0
5th row: 999.0

Common Values

Value | Count | Frequency (%)
999.0 | 51505 | 80.0%
(Missing) | 12899 | 20.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
999.0 | 51505 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
9 | 154515 | 60.0%
. | 51505 | 20.0%
0 | 51505 | 20.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 206020 | 80.0%
Other Punctuation | 51505 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
9 | 154515 | 75.0%
0 | 51505 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
. | 51505 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 257525 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
9 | 154515 | 60.0%
. | 51505 | 20.0%
0 | 51505 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 257525 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
9 | 154515 | 60.0%
. | 51505 | 20.0%
0 | 51505 | 20.0%

TipoVehiculo__c
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Top categories (count): 0 (51505), 99999 (12899)

Length

Max length: 5
Median length: 1
Mean length: 1.801130365
Min length: 1

Characters and Unicode

Total characters: 116000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 99999
2nd row: 99999
3rd row: 99999
4th row: 99999
5th row: 99999

Common Values

Value | Count | Frequency (%)
0 | 51505 | 80.0%
99999 | 12899 | 20.0%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
9 | 64495 | 55.6%
0 | 51505 | 44.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 116000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
9 | 64495 | 55.6%
0 | 51505 | 44.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 116000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
9 | 64495 | 55.6%
0 | 51505 | 44.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 116000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
9 | 64495 | 55.6%
0 | 51505 | 44.4%

NumeroPoliza__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 60441
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3954909.092
Minimum: 1000002
Maximum: 4845222
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 1000002
5-th percentile: 1003854.15
Q1: 4139160.75
median: 4587055.5
Q3: 4618324.25
95-th percentile: 4631331.85
Maximum: 4845222
Range: 3845220
Interquartile range (IQR): 479163.5

Descriptive statistics

Standard deviation: 1154304.525
Coefficient of variation (CV): 0.2918662601
Kurtosis: 2.046573422
Mean: 3954909.092
Median Absolute Deviation (MAD): 44192
Skewness: -1.875809211
Sum: 2.547119651 × 10^11
Variance: 1.332418937 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
1004261 | 11 | < 0.1%
1001176 | 11 | < 0.1%
1001182 | 11 | < 0.1%
1004259 | 11 | < 0.1%
1001214 | 10 | < 0.1%
1001179 | 10 | < 0.1%
1004276 | 10 | < 0.1%
1004239 | 10 | < 0.1%
1000489 | 10 | < 0.1%
1001257 | 9 | < 0.1%
Other values (60431) | 64301 | 99.8%

Smallest values

Value | Count | Frequency (%)
1000002 | 5 | < 0.1%
1000004 | 5 | < 0.1%
1000006 | 3 | < 0.1%
1000007 | 1 | < 0.1%
1000009 | 5 | < 0.1%
1000010 | 4 | < 0.1%
1000013 | 1 | < 0.1%
1000014 | 3 | < 0.1%
1000015 | 3 | < 0.1%
1000016 | 4 | < 0.1%

Largest values

Value | Count | Frequency (%)
4845222 | 1 | < 0.1%
4663419 | 1 | < 0.1%
4661342 | 1 | < 0.1%
4649462 | 1 | < 0.1%
4649406 | 1 | < 0.1%
4645719 | 1 | < 0.1%
4645718 | 1 | < 0.1%
4640593 | 1 | < 0.1%
4634789 | 1 | < 0.1%
4634788 | 1 | < 0.1%

FechaInicioVigencia__ctrim
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
Top categories (count): 02-2021 (61209), 01-2021 (3195)

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 450828
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 02-2021
2nd row: 02-2021
3rd row: 02-2021
4th row: 01-2021
5th row: 02-2021

Common Values

Value | Count | Frequency (%)
02-2021 | 61209 | 95.0%
01-2021 | 3195 | 5.0%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
2 | 190017 | 42.1%
0 | 128808 | 28.6%
1 | 67599 | 15.0%
- | 64404 | 14.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 386424 | 85.7%
Dash Punctuation | 64404 | 14.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 190017 | 49.2%
0 | 128808 | 33.3%
1 | 67599 | 17.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 64404 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 450828 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 190017 | 42.1%
0 | 128808 | 28.6%
1 | 67599 | 15.0%
- | 64404 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 450828 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 190017 | 42.1%
0 | 128808 | 28.6%
1 | 67599 | 15.0%
- | 64404 | 14.3%

churn
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Top categories (count): 0 (54055), 1 (10349)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 64404
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 54055 | 83.9%
1 | 10349 | 16.1%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
0 | 54055 | 83.9%
1 | 10349 | 16.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 64404 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 54055 | 83.9%
1 | 10349 | 16.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 64404 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 54055 | 83.9%
1 | 10349 | 16.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 64404 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 54055 | 83.9%
1 | 10349 | 16.1%
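The churn counts imply an imbalanced split of roughly 84% to 16%. Assuming churn is the modelling target (the report does not say so), the sketch below derives the churn rate, the accuracy of always predicting the majority class, and inverse-frequency class weights of the form n / (k × count), where k is the number of classes. All names are illustrative.

```python
# Class counts copied from the churn "Common Values" table.
counts = {0: 54055, 1: 10349}

total = sum(counts.values())
churn_rate = counts[1] / total

# Accuracy obtained by always predicting the most frequent class.
baseline_accuracy = max(counts.values()) / total

# Inverse-frequency weights: rarer classes receive larger weights.
weights = {label: total / (len(counts) * c) for label, c in counts.items()}

print(f"churn rate: {100 * churn_rate:.1f}%")
print(f"majority-class baseline accuracy: {100 * baseline_accuracy:.1f}%")
print({label: round(w, 3) for label, w in weights.items()})
```

Any classifier evaluated on this data should be compared against the majority-class baseline rather than against 50%.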

n_prod_prev
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): 0.2%
Missing: 61750
Missing (%): 95.9%
Memory size: 2.5 MiB
Top categories (count): 1.0 (1071), 3.0 (1021), 8.0 (474), 2.0 (80), 4.0 (8)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 7962
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8.0
2nd row: 8.0
3rd row: 8.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
1.0 | 1071 | 1.7%
3.0 | 1021 | 1.6%
8.0 | 474 | 0.7%
2.0 | 80 | 0.1%
4.0 | 8 | < 0.1%
(Missing) | 61750 | 95.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1.0 | 1071 | 40.4%
3.0 | 1021 | 38.5%
8.0 | 474 | 17.9%
2.0 | 80 | 3.0%
4.0 | 8 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
. | 2654 | 33.3%
0 | 2654 | 33.3%
1 | 1071 | 13.5%
3 | 1021 | 12.8%
8 | 474 | 6.0%
2 | 80 | 1.0%
4 | 8 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 5308 | 66.7%
Other Punctuation | 2654 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 2654 | 50.0%
1 | 1071 | 20.2%
3 | 1021 | 19.2%
8 | 474 | 8.9%
2 | 80 | 1.5%
4 | 8 | 0.2%

Other Punctuation
Value | Count | Frequency (%)
. | 2654 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 7962 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 2654 | 33.3%
0 | 2654 | 33.3%
1 | 1071 | 13.5%
3 | 1021 | 12.8%
8 | 474 | 6.0%
2 | 80 | 1.0%
4 | 8 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7962 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 2654 | 33.3%
0 | 2654 | 33.3%
1 | 1071 | 13.5%
3 | 1021 | 12.8%
8 | 474 | 6.0%
2 | 80 | 1.0%
4 | 8 | 0.1%

total_siniestros
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 32
Distinct (%): 0.8%
Missing: 60246
Missing (%): 93.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.32924483
Minimum: 1
Maximum: 940
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 20
Q3: 38
95-th percentile: 150
Maximum: 940
Range: 939
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 63.0592206
Coefficient of variation (CV): 1.304783901
Kurtosis: 27.59590949
Mean: 48.32924483
Median Absolute Deviation (MAD): 18
Skewness: 2.855234945
Sum: 200953
Variance: 3976.465303
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=32)

Most frequent values

Value | Count | Frequency (%)
150 | 974 | 1.5%
1 | 786 | 1.2%
37 | 548 | 0.9%
38 | 475 | 0.7%
16 | 247 | 0.4%
3 | 238 | 0.4%
2 | 218 | 0.3%
4 | 142 | 0.2%
8 | 96 | 0.1%
5 | 65 | 0.1%
Other values (22) | 369 | 0.6%
(Missing) | 60246 | 93.5%

Smallest values

Value | Count | Frequency (%)
1 | 786 | 1.2%
2 | 218 | 0.3%
3 | 238 | 0.4%
4 | 142 | 0.2%
5 | 65 | 0.1%
6 | 33 | 0.1%
7 | 52 | 0.1%
8 | 96 | 0.1%
9 | 27 | < 0.1%
10 | 31 | < 0.1%

Largest values

Value | Count | Frequency (%)
940 | 3 | < 0.1%
150 | 974 | 1.5%
90 | 2 | < 0.1%
80 | 4 | < 0.1%
52 | 7 | < 0.1%
51 | 1 | < 0.1%
45 | 4 | < 0.1%
38 | 475 | 0.7%
37 | 548 | 0.9%
36 | 2 | < 0.1%
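With 93.5% of total_siniestros missing, the reported mean describes only the rows that have a value. The sketch below recomputes it from the reported sum and counts, and contrasts it with the mean obtained if missing values were filled with zero. Zero-filling is a hypothetical choice shown only for comparison, not something the report applies.

```python
# Figures copied from the dataset overview and the total_siniestros statistics.
n_observations = 64404
n_missing = 60246
reported_sum = 200953

# The reported mean is computed over non-missing rows only.
n_present = n_observations - n_missing
mean_present = reported_sum / n_present

# Hypothetical alternative: treat every missing value as 0 claims.
mean_if_zero_filled = reported_sum / n_observations

print(f"rows with a value: {n_present}")
print(f"mean over present rows: {mean_present:.4f}")
print(f"mean if missing were 0: {mean_if_zero_filled:.2f}")
```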

total_pagado_smmlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 386
Distinct (%): 9.3%
Missing: 60246
Missing (%): 93.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2670.067165
Minimum: 0
Maximum: 55871.95629
Zeros: 627
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 17.35001589
median: 370.2541766
Q3: 3065.269972
95-th percentile: 8833.286309
Maximum: 55871.95629
Range: 55871.95629
Interquartile range (IQR): 3047.919956

Descriptive statistics

Standard deviation: 3785.493099
Coefficient of variation (CV): 1.417752013
Kurtosis: 18.08380843
Mean: 2670.067165
Median Absolute Deviation (MAD): 370.2541766
Skewness: 2.27192553
Sum: 11102139.27
Variance: 14329958
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
8833.286309 | 974 | 1.5%
0 | 627 | 1.0%
1306.925184 | 527 | 0.8%
3065.269972 | 474 | 0.7%
57.0596277 | 218 | 0.3%
17.35001589 | 126 | 0.2%
38.70535986 | 81 | 0.1%
207.8908871 | 53 | 0.1%
87.77731669 | 48 | 0.1%
50.85823109 | 45 | 0.1%
Other values (376) | 985 | 1.5%
(Missing) | 60246 | 93.5%

Smallest values

Value | Count | Frequency (%)
0 | 627 | 1.0%
0.1679049361 | 1 | < 0.1%
0.225420076 | 2 | < 0.1%
0.2439357817 | 1 | < 0.1%
0.2557769398 | 1 | < 0.1%
0.2844277434 | 1 | < 0.1%
0.2931979932 | 1 | < 0.1%
0.3073748027 | 1 | < 0.1%
0.3077402298 | 1 | < 0.1%
0.333331132 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
55871.95629 | 2 | < 0.1%
22272.9517 | 3 | < 0.1%
8833.286309 | 974 | 1.5%
4385.698773 | 2 | < 0.1%
3065.269972 | 474 | 0.7%
2345.751903 | 3 | < 0.1%
2265.861102 | 1 | < 0.1%
1556.751448 | 1 | < 0.1%
1446.628935 | 1 | < 0.1%
1306.925184 | 527 | 0.8%

anios_ultimo_siniestro
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 221
Distinct (%): 5.3%
Missing: 60246
Missing (%): 93.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2722416599
Minimum: 0.002739726027
Maximum: 9.465753425
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 0.002739726027
5-th percentile: 0.002739726027
Q1: 0.005479452055
median: 0.01095890411
Q3: 0.08493150685
95-th percentile: 1.606986301
Maximum: 9.465753425
Range: 9.463013699
Interquartile range (IQR): 0.07945205479

Descriptive statistics

Standard deviation: 0.9555272766
Coefficient of variation (CV): 3.509849583
Kurtosis: 34.33231817
Mean: 0.2722416599
Median Absolute Deviation (MAD): 0.008219178082
Skewness: 5.496456599
Sum: 1131.980822
Variance: 0.9130323764
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value | Count | Frequency (%)
0.002739726027 | 990 | 1.5%
0.008219178082 | 559 | 0.9%
0.01095890411 | 557 | 0.9%
0.005479452055 | 395 | 0.6%
0.09315068493 | 137 | 0.2%
0.1095890411 | 101 | 0.2%
0.07123287671 | 70 | 0.1%
0.01643835616 | 56 | 0.1%
0.07671232877 | 55 | 0.1%
0.03287671233 | 43 | 0.1%
Other values (211) | 1195 | 1.9%
(Missing) | 60246 | 93.5%

Smallest values

Value | Count | Frequency (%)
0.002739726027 | 990 | 1.5%
0.005479452055 | 395 | 0.6%
0.008219178082 | 559 | 0.9%
0.01095890411 | 557 | 0.9%
0.01369863014 | 35 | 0.1%
0.01643835616 | 56 | 0.1%
0.02191780822 | 13 | < 0.1%
0.02465753425 | 32 | < 0.1%
0.02739726027 | 6 | < 0.1%
0.0301369863 | 15 | < 0.1%

Largest values

Value | Count | Frequency (%)
9.465753425 | 1 | < 0.1%
8.75890411 | 6 | < 0.1%
8 | 5 | < 0.1%
7.531506849 | 2 | < 0.1%
7.178082192 | 6 | < 0.1%
7.101369863 | 3 | < 0.1%
7.005479452 | 3 | < 0.1%
6.994520548 | 1 | < 0.1%
6.235616438 | 10 | < 0.1%
6.008219178 | 7 | < 0.1%

Activos__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 1655
Distinct (%): 26.7%
Missing: 58200
Missing (%): 90.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 513144079.7
Minimum: 0
Maximum: 8.31 × 10^10
Zeros: 50
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3000000
Q1: 50000000
median: 123219000
Q3: 331112500
95-th percentile: 1558311000
Maximum: 8.31 × 10^10
Range: 8.31 × 10^10
Interquartile range (IQR): 281112500

Descriptive statistics

Standard deviation: 2312679344
Coefficient of variation (CV): 4.506881079
Kurtosis: 395.0702999
Mean: 513144079.7
Median Absolute Deviation (MAD): 98219000
Skewness: 16.36979187
Sum: 3.183545871 × 10^12
Variance: 5.348485747 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100000000 | 205 | 0.3%
200000000 | 179 | 0.3%
50000000 | 174 | 0.3%
80000000 | 163 | 0.3%
150000000 | 158 | 0.2%
120000000 | 127 | 0.2%
60000000 | 125 | 0.2%
300000000 | 116 | 0.2%
10000000 | 109 | 0.2%
30000000 | 102 | 0.2%
Other values (1645) | 4746 | 7.4%
(Missing) | 58200 | 90.4%

Value | Count | Frequency (%)
0 | 50 | 0.1%
1 | 90 | 0.1%
2 | 38 | 0.1%
20 | 3 | < 0.1%
10000 | 1 | < 0.1%
107254 | 1 | < 0.1%
127000 | 1 | < 0.1%
200000 | 2 | < 0.1%
230000 | 1 | < 0.1%
300000 | 1 | < 0.1%

Value | Count | Frequency (%)
8.31 × 10^10 | 1 | < 0.1%
5.643561885 × 10^10 | 1 | < 0.1%
3.3401125 × 10^10 | 3 | < 0.1%
3.2934155 × 10^10 | 7 | < 0.1%
2.8207083 × 10^10 | 1 | < 0.1%
2.379921321 × 10^10 | 1 | < 0.1%
1.9483034 × 10^10 | 3 | < 0.1%
1.9263183 × 10^10 | 3 | < 0.1%
1.7267744 × 10^10 | 7 | < 0.1%
1.56067031 × 10^10 | 5 | < 0.1%

AnnualRevenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 1830
Distinct (%): 29.5%
Missing: 58200
Missing (%): 90.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 304876044.3
Minimum: 0
Maximum: 7.2539149 × 10^10
Zeros: 67
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5500000
Q1: 24000000
median: 43000000
Q3: 96000000
95-th percentile: 769247174.6
Maximum: 7.2539149 × 10^10
Range: 7.2539149 × 10^10
Interquartile range (IQR): 72000000

Descriptive statistics

Standard deviation: 2063906208
Coefficient of variation (CV): 6.76965687
Kurtosis: 393.1213775
Mean: 304876044.3
Median Absolute Deviation (MAD): 26673126
Skewness: 17.16512571
Sum: 1.891450979 × 10^12
Variance: 4.259708835 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
36000000 | 252 | 0.4%
24000000 | 209 | 0.3%
60000000 | 182 | 0.3%
30000000 | 177 | 0.3%
48000000 | 165 | 0.3%
12000000 | 135 | 0.2%
40000000 | 104 | 0.2%
18000000 | 100 | 0.2%
50000000 | 93 | 0.1%
42000000 | 75 | 0.1%
Other values (1820) | 4712 | 7.3%
(Missing) | 58200 | 90.4%

Value | Count | Frequency (%)
0 | 67 | 0.1%
1 | 13 | < 0.1%
52230 | 1 | < 0.1%
90790 | 1 | < 0.1%
240000 | 1 | < 0.1%
500000 | 1 | < 0.1%
600000 | 1 | < 0.1%
835000 | 1 | < 0.1%
877000 | 1 | < 0.1%
900000 | 1 | < 0.1%

Value | Count | Frequency (%)
7.2539149 × 10^10 | 1 | < 0.1%
4.24173603 × 10^10 | 2 | < 0.1%
3.408 × 10^10 | 1 | < 0.1%
3.243391822 × 10^10 | 1 | < 0.1%
2.9458395 × 10^10 | 1 | < 0.1%
2.9375273 × 10^10 | 7 | < 0.1%
2.765797 × 10^10 | 3 | < 0.1%
2.010036 × 10^10 | 1 | < 0.1%
1.897052621 × 10^10 | 5 | < 0.1%
1.7620986 × 10^10 | 1 | < 0.1%

MontoAnual__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 6
Distinct (%): 66.7%
Missing: 64395
Missing (%): > 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2579.111111
Minimum: 0
Maximum: 20000
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 10
Q3: 102
95-th percentile: 13200
Maximum: 20000
Range: 20000
Interquartile range (IQR): 102

Descriptive statistics

Standard deviation: 6606.381166
Coefficient of variation (CV): 2.561495369
Kurtosis: 8.4219912
Mean: 2579.111111
Median Absolute Deviation (MAD): 10
Skewness: 2.882336495
Sum: 23212
Variance: 43644272.11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
0 | 4 | < 0.1%
3000 | 1 | < 0.1%
100 | 1 | < 0.1%
102 | 1 | < 0.1%
10 | 1 | < 0.1%
20000 | 1 | < 0.1%
(Missing) | 64395 | > 99.9%

Value | Count | Frequency (%)
0 | 4 | < 0.1%
10 | 1 | < 0.1%
100 | 1 | < 0.1%
102 | 1 | < 0.1%
3000 | 1 | < 0.1%
20000 | 1 | < 0.1%

Value | Count | Frequency (%)
20000 | 1 | < 0.1%
3000 | 1 | < 0.1%
102 | 1 | < 0.1%
100 | 1 | < 0.1%
10 | 1 | < 0.1%
0 | 4 | < 0.1%

OtrosIngresos__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 142
Distinct (%): 3.0%
Missing: 59709
Missing (%): 92.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 4471255.486
Minimum: 0
Maximum: 3479487967
Zeros: 4402
Zeros (%): 6.8%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 4200000
Maximum: 3479487967
Range: 3479487967
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 61763776.62
Coefficient of variation (CV): 13.8135199
Kurtosis: 2171.466619
Mean: 4471255.486
Median Absolute Deviation (MAD): 0
Skewness: 41.12936967
Sum: 2.099254451 × 10^10
Variance: 3.814764103 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 4402 | 6.8%
15000000 | 11 | < 0.1%
12000000 | 9 | < 0.1%
10000000 | 9 | < 0.1%
5000000 | 9 | < 0.1%
20000000 | 8 | < 0.1%
6000000 | 8 | < 0.1%
165644000 | 7 | < 0.1%
2000000 | 7 | < 0.1%
36000000 | 6 | < 0.1%
Other values (132) | 219 | 0.3%
(Missing) | 59709 | 92.7%

Value | Count | Frequency (%)
0 | 4402 | 6.8%
1 | 1 | < 0.1%
2 | 2 | < 0.1%
4000 | 4 | < 0.1%
9000 | 1 | < 0.1%
37000 | 1 | < 0.1%
227000 | 2 | < 0.1%
274000 | 1 | < 0.1%
500000 | 1 | < 0.1%
529000 | 1 | < 0.1%

Value | Count | Frequency (%)
3479487967 | 1 | < 0.1%
893893000 | 2 | < 0.1%
720625000 | 3 | < 0.1%
655490000 | 1 | < 0.1%
517969000 | 2 | < 0.1%
468342000 | 3 | < 0.1%
196218120 | 1 | < 0.1%
194874000 | 1 | < 0.1%
194601000 | 4 | < 0.1%
187969000 | 3 | < 0.1%

Profesion__pc
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 64404
Missing (%): 100.0%
Memory size: 503.3 KiB

EgresosAnuales__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 1374
Distinct (%): 22.1%
Missing: 58200
Missing (%): 90.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 245358666.2
Minimum: 0
Maximum: 7.1738788 × 10^10
Zeros: 68
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2000000
Q1: 12000000
median: 25000000
Q3: 58000000
95-th percentile: 592034849.4
Maximum: 7.1738788 × 10^10
Range: 7.1738788 × 10^10
Interquartile range (IQR): 46000000

Descriptive statistics

Standard deviation: 1938474429
Coefficient of variation (CV): 7.900574531
Kurtosis: 463.440573
Mean: 245358666.2
Median Absolute Deviation (MAD): 15400000
Skewness: 18.7591844
Sum: 1.522205165 × 10^12
Variance: 3.757683113 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
12000000 | 272 | 0.4%
30000000 | 247 | 0.4%
18000000 | 225 | 0.3%
24000000 | 214 | 0.3%
20000000 | 203 | 0.3%
10000000 | 154 | 0.2%
40000000 | 145 | 0.2%
36000000 | 141 | 0.2%
15000000 | 132 | 0.2%
25000000 | 108 | 0.2%
Other values (1364) | 4363 | 6.8%
(Missing) | 58200 | 90.4%

Value | Count | Frequency (%)
0 | 68 | 0.1%
1 | 39 | 0.1%
200000 | 1 | < 0.1%
250000 | 1 | < 0.1%
300000 | 2 | < 0.1%
339561 | 1 | < 0.1%
400000 | 2 | < 0.1%
450000 | 14 | < 0.1%
497395 | 1 | < 0.1%
500000 | 12 | < 0.1%

Value | Count | Frequency (%)
7.1738788 × 10^10 | 1 | < 0.1%
4.033890964 × 10^10 | 2 | < 0.1%
3.472297457 × 10^10 | 1 | < 0.1%
3.36 × 10^10 | 1 | < 0.1%
2.8430656 × 10^10 | 7 | < 0.1%
2.7557799 × 10^10 | 1 | < 0.1%
2.5171626 × 10^10 | 3 | < 0.1%
1.9746108 × 10^10 | 1 | < 0.1%
1.663542953 × 10^10 | 1 | < 0.1%
1.6135595 × 10^10 | 1 | < 0.1%

EstadoCivil__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): < 0.1%
Missing: 9823
Missing (%): 15.3%
Memory size: 3.5 MiB

N A: 41989
SOLTERO: 8224
CASADO: 2709
OTRO: 1476
UNIDO: 120
Other values (3): 63

Length

Max length: 10
Median length: 3
Mean length: 3.787691688
Min length: 3

Characters and Unicode

Total characters: 206736
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
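As a minimal sketch of how such character properties can be read, the snippet below uses Python's standard `unicodedata` module to count the general category of every character in a handful of strings. The helper name `category_counts` is illustrative and is not part of the profiling tool; the five strings are the sample rows listed for this variable. The category codes `Lu` and `Zs` correspond to the "Uppercase Letter" and "Space Separator" labels used in this report. Script and block lookups are not covered here because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample rows reported for EstadoCivil__pc.
sample = ["SOLTERO", "OTRO", "SOLTERO", "N A", "OTRO"]
counts = category_counts(sample)
print(dict(counts))
```

On these five rows every character is either an uppercase letter or the single space inside "N A", which matches the two distinct categories reported for the full column.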

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SOLTERO
2nd row: OTRO
3rd row: SOLTERO
4th row: N A
5th row: OTRO

Common Values

Value | Count | Frequency (%)
N A | 41989 | 65.2%
SOLTERO | 8224 | 12.8%
CASADO | 2709 | 4.2%
OTRO | 1476 | 2.3%
UNIDO | 120 | 0.2%
VIUDO | 27 | < 0.1%
SEPARADO | 26 | < 0.1%
DIVORCIADO | 10 | < 0.1%
(Missing) | 9823 | 15.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
n | 41989 | 43.5%
a | 41989 | 43.5%
soltero | 8224 | 8.5%
casado | 2709 | 2.8%
otro | 1476 | 1.5%
unido | 120 | 0.1%
viudo | 27 | < 0.1%
separado | 26 | < 0.1%
divorciado | 10 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
A | 47469 | 23.0%
N | 42109 | 20.4%
(space) | 41989 | 20.3%
O | 22302 | 10.8%
S | 10959 | 5.3%
R | 9736 | 4.7%
T | 9700 | 4.7%
E | 8250 | 4.0%
L | 8224 | 4.0%
D | 2902 | 1.4%
Other values (5) | 3096 | 1.5%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 164747 | 79.7%
Space Separator | 41989 | 20.3%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 47469 | 28.8%
N | 42109 | 25.6%
O | 22302 | 13.5%
S | 10959 | 6.7%
R | 9736 | 5.9%
T | 9700 | 5.9%
E | 8250 | 5.0%
L | 8224 | 5.0%
D | 2902 | 1.8%
C | 2719 | 1.7%
Other values (4) | 377 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 41989 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 164747 | 79.7%
Common | 41989 | 20.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 47469 | 28.8%
N | 42109 | 25.6%
O | 22302 | 13.5%
S | 10959 | 6.7%
R | 9736 | 5.9%
T | 9700 | 5.9%
E | 8250 | 5.0%
L | 8224 | 5.0%
D | 2902 | 1.8%
C | 2719 | 1.7%
Other values (4) | 377 | 0.2%

Common
Value | Count | Frequency (%)
(space) | 41989 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 206736 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 47469 | 23.0%
N | 42109 | 20.4%
(space) | 41989 | 20.3%
O | 22302 | 10.8%
S | 10959 | 5.3%
R | 9736 | 4.7%
T | 9700 | 4.7%
E | 8250 | 4.0%
L | 8224 | 4.0%
D | 2902 | 1.4%
Other values (5) | 3096 | 1.5%

Genero__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 9823
Missing (%): 15.3%
Memory size: 3.5 MiB

N A: 40486
MASCULINO: 11399
FEMENINO: 2696

Length

Max length: 9
Median length: 3
Mean length: 4.500045803
Min length: 3

Characters and Unicode

Total characters: 245617
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MASCULINO
2nd row: MASCULINO
3rd row: FEMENINO
4th row: N A
5th row: MASCULINO

Common Values

Value | Count | Frequency (%)
N A | 40486 | 62.9%
MASCULINO | 11399 | 17.7%
FEMENINO | 2696 | 4.2%
(Missing) | 9823 | 15.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
n | 40486 | 42.6%
a | 40486 | 42.6%
masculino | 11399 | 12.0%
femenino | 2696 | 2.8%

Most occurring characters

Value | Count | Frequency (%)
N | 57277 | 23.3%
A | 51885 | 21.1%
(space) | 40486 | 16.5%
M | 14095 | 5.7%
I | 14095 | 5.7%
O | 14095 | 5.7%
S | 11399 | 4.6%
C | 11399 | 4.6%
U | 11399 | 4.6%
L | 11399 | 4.6%
Other values (2) | 8088 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 205131 | 83.5%
Space Separator | 40486 | 16.5%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
N | 57277 | 27.9%
A | 51885 | 25.3%
M | 14095 | 6.9%
I | 14095 | 6.9%
O | 14095 | 6.9%
S | 11399 | 5.6%
C | 11399 | 5.6%
U | 11399 | 5.6%
L | 11399 | 5.6%
E | 5392 | 2.6%

Space Separator
Value | Count | Frequency (%)
(space) | 40486 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 205131 | 83.5%
Common | 40486 | 16.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 57277 | 27.9%
A | 51885 | 25.3%
M | 14095 | 6.9%
I | 14095 | 6.9%
O | 14095 | 6.9%
S | 11399 | 5.6%
C | 11399 | 5.6%
U | 11399 | 5.6%
L | 11399 | 5.6%
E | 5392 | 2.6%

Common
Value | Count | Frequency (%)
(space) | 40486 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 245617 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
N | 57277 | 23.3%
A | 51885 | 21.1%
(space) | 40486 | 16.5%
M | 14095 | 5.7%
I | 14095 | 5.7%
O | 14095 | 5.7%
S | 11399 | 4.6%
C | 11399 | 4.6%
U | 11399 | 4.6%
L | 11399 | 4.6%
Other values (2) | 8088 | 3.3%

edad
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 5287
Distinct (%): 63.1%
Missing: 56027
Missing (%): 87.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.9122542
Minimum: 1.501369863
Maximum: 122.4273973
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 503.3 KiB

Quantile statistics

Minimum: 1.501369863
5-th percentile: 26.01917808
Q1: 36.07945205
median: 47.43835616
Q3: 60.60273973
95-th percentile: 122.4273973
Maximum: 122.4273973
Range: 120.9260274
Interquartile range (IQR): 24.52328767

Descriptive statistics

Standard deviation: 24.95405359
Coefficient of variation (CV): 0.4716119917
Kurtosis: 2.231559507
Mean: 52.9122542
Median Absolute Deviation (MAD): 12.11780822
Skewness: 1.575489523
Sum: 443245.9534
Variance: 622.7047906
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
122.4273973 | 681 | 1.1%
41.90684932 | 16 | < 0.1%
76.2109589 | 15 | < 0.1%
72.96712329 | 14 | < 0.1%
42.37534247 | 13 | < 0.1%
52.38082192 | 12 | < 0.1%
57.62465753 | 12 | < 0.1%
48.37808219 | 12 | < 0.1%
42.09863014 | 11 | < 0.1%
56.82465753 | 8 | < 0.1%
Other values (5277) | 7583 | 11.8%
(Missing) | 56027 | 87.0%

Value | Count | Frequency (%)
1.501369863 | 1 | < 0.1%
2.326027397 | 1 | < 0.1%
5.536986301 | 1 | < 0.1%
5.553424658 | 1 | < 0.1%
6.279452055 | 1 | < 0.1%
7.093150685 | 1 | < 0.1%
7.317808219 | 1 | < 0.1%
7.504109589 | 2 | < 0.1%
7.58630137 | 1 | < 0.1%
7.739726027 | 1 | < 0.1%

Value | Count | Frequency (%)
122.4273973 | 681 | 1.1%
112.7150685 | 1 | < 0.1%
102.9890411 | 1 | < 0.1%
94.96438356 | 1 | < 0.1%
92.14246575 | 3 | < 0.1%
90.84109589 | 2 | < 0.1%
90.04383562 | 1 | < 0.1%
89.71232877 | 2 | < 0.1%
89.53424658 | 1 | < 0.1%
88.78630137 | 2 | < 0.1%

Interactions

[144 pairwise interaction plots (12 × 12 numeric variables); images not available in this text export.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
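The calculation described above can be sketched in plain Python: rank both variables (giving tied values the average of their positions), then apply the Pearson formula to the ranks. This is an illustrative sketch, not the profiling tool's own code; the function names and the toy data are assumptions made for the example.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y), pearson(x, y))
```

On this toy data ρ is 1 up to floating-point rounding, while r stays below 1 because the relationship is monotonic but not linear.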

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
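A short sketch of this formula, together with a check of the invariance property mentioned above, might look as follows. The data are made up for illustration, and the rescaling uses positive scale factors (a negative factor would flip the sign of r).

```python
import math

def pearson(x, y):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# Hypothetical data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson(x, y)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# An exact line gives r = 1 whether it is steep or shallow.
r_steep = pearson(x, [100 * a for a in x])
r_shallow = pearson(x, [0.01 * a for a in x])
print(r, r_rescaled, r_steep, r_shallow)
```

The first two values agree up to rounding, and both exact lines give r equal to 1 up to rounding, regardless of slope.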

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
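The pair-counting rule above can be written directly as a short function. This sketch implements exactly the formula stated in the text (often called tau-a); library implementations commonly report a tie-adjusted variant instead, so values may differ when the data contain ties. The function name and toy data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # pairs tied in x or y count as neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
print(kendall_tau_a(x, y))
```

The quadratic loop over all pairs is fine for illustration but would be slow on a dataset of this report's size.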

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
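As a rough sketch of the bias-corrected estimator referred to above, the function below computes Cramér's V from a contingency table of counts, applying the correction terms proposed by Bergsma (2013). It is not taken from the profiling tool's source; the function name and the example tables are illustrative, and the code assumes every row and column total is non-zero.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi^2 and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

independent = [[10, 20], [20, 40]]   # rows are proportional: no association
perfect = [[50, 0], [0, 50]]         # each row maps to exactly one column
partial = [[30, 10], [10, 30]]
print(cramers_v_corrected(independent),
      cramers_v_corrected(perfect),
      cramers_v_corrected(partial))
```

The independent table gives 0, the perfectly associated table gives 1 up to rounding, and the partially associated table falls in between.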

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
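One common way to obtain a nullity correlation like the one described for the heatmap is to turn each column into a 0/1 missingness indicator and take the Pearson correlation of the indicators. The sketch below does this in plain Python, using `None` to mark a missing cell. The function name and the toy columns are hypothetical; only the column names are borrowed from this report, where Activos__c and AnnualRevenue both show 58200 missing values.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column is always or never missing, because the
    correlation is undefined for a constant indicator.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns: the first two are missing in the same rows.
activos = [5.0e8, None, None, 8.0e7, None]
revenue = [3.0e8, None, None, 4.0e7, None]
opposite = [None, 1.0, 2.0, None, 3.0]
complete = [1, 2, 3, 4, 5]

print(nullity_correlation(activos, revenue))
print(nullity_correlation(activos, opposite))
print(nullity_correlation(activos, complete))
```

A value near 1 means two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and columns with no variation in missingness have no defined value.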

Sample

First rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_poliza_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cTipoVehiculo__cNumeroPoliza__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcedad
01805de daños tradicionalotras99999NaNNaN99999100114002-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
11805de dañosotras99999NaNNaN99999100114002-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
21805flotanteotras99999NaNNaN99999100114002-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
311203responsabilidad civilotras99999NaNNaN99999100992301-20211NaNNaNNaNNaN5.000000e+083.000000e+08NaNNaNNaN2.500000e+08SOLTEROMASCULINO34.769863
44404colectivaotras99999NaNNaN99999303449002-20211NaN4.00.0000003.082192NaNNaNNaNNaNNaNNaNNaNNaNNaN
511002individualconvenios99999NaNNaN99999304871201-202118.038.03065.2699720.008219NaNNaNNaNNaNNaNNaNNaNNaNNaN
611002individualconvenios99999NaNNaN99999305243201-202118.038.03065.2699720.008219NaNNaNNaNNaNNaNNaNNaNNaNNaN
7115s.o.a.totras197.0999.00459329501-20210NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
8123colectivaotras99999NaNNaN99999302552802-20211NaNNaNNaNNaN1.856000e+092.251415e+09NaN1870000.0NaN2.111152e+09OTROMASCULINO41.150685
918042individualconvenios99999NaNNaN99999307555901-20210NaNNaNNaNNaN8.000000e+074.000000e+07NaN0.0NaN1.400000e+07SOLTEROFEMENINO40.542466

Last rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_poliza_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cTipoVehiculo__cNumeroPoliza__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcedad
6439441803responsabilidad civilotras99999NaNNaN99999100833601-202111.0NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
6439511002normalotras99999NaNNaN99999100267901-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
64396123individualotras99999NaNNaN99999302543401-20210NaNNaNNaNNaN3.307350e+081.916790e+08NaN0.0NaN1.716010e+08SOLTEROMASCULINO31.389041
6439741402de dañosotras99999NaNNaN99999100117901-20211NaN12.025.8501980.073973NaNNaNNaNNaNNaNNaNNaNNaNNaN
6439813301otrasotras99999NaNNaN99999300019301-202112.0NaNNaNNaN3.429540e+091.625870e+09NaN34993000.0NaN1.289049e+09CASADOMASCULINO55.695890
6439913301otrasotras99999NaNNaN99999300019201-202112.0NaNNaNNaN3.429540e+091.625870e+09NaN34993000.0NaN1.289049e+09CASADOMASCULINO55.695890
6440013202individualconvenios99999NaNNaN99999312970001-20210NaNNaNNaNNaN1.585329e+091.150000e+08NaN3000000.0NaN6.500000e+07CASADOMASCULINO42.375342
644011404colectivaotras99999NaNNaN99999303437601-202101.0NaNNaNNaN2.527961e+092.285945e+09NaN0.0NaN1.998118e+09OTROMASCULINO46.336986
6440211820individualotras99999NaNNaN99999307590701-20210NaNNaNNaNNaN1.500000e+086.800000e+07NaN0.0NaN4.000000e+07CASADOMASCULINO39.693151
6440313303s.o.a.totras197.0999.00459516202-20210NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNN AMASCULINO27.227397